Implementing a Translation Feature with a Python Crawler
Hu_ny
2021-08-04 11:42:55
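© Copyright belongs to the author. This is an original work by Hu_ny, published on the 51CTO blog; contact the author for authorization before reprinting, otherwise legal liability will be pursued. Source: https://blog.51cto.com/huny/3264956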
Preface
After studying Python theory for so long, it is time to consolidate it with some hands-on practice.
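Preparation
The crawler uses urllib, which is part of the Python 3 standard library, so nothing needs to be installed (running pip install urllib is not required).
Capture the request URL that the Youdao Translate page uses.
The parameters that need to be sent are listed under Form Data.![Python爬虫实现翻译功能_json_02](https://s9.51cto.com/images/blog/202108/04/83485a216329465216a6b2027c975e2a.png?x-oss-process=image/watermark,size_16,text_QDUxQ1RP5Y2a5a6i,color_FFFFFF,t_100,g_se,x_10,y_10,shadow_90,type_ZmFuZ3poZW5naGVpdGk=)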
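As a minimal sketch of how form fields like these become the body of a POST request, the snippet below encodes a small dictionary with urllib.parse.urlencode. The two-field dictionary is a shortened, hypothetical subset chosen for illustration; the real request carries every field shown under Form Data.

```python
import urllib.parse

# Hypothetical subset of the form fields, for illustration only.
form = {'i': 'i love python', 'doctype': 'json'}

# urlencode() joins the pairs as "key=value&key=value" and escapes each part
# (spaces become '+'); encode() produces the bytes that urlopen() expects.
body = urllib.parse.urlencode(form).encode('utf-8')
print(body)  # b'i=i+love+python&doctype=json'

# Non-ASCII text is percent-encoded from its UTF-8 bytes.
chinese = urllib.parse.urlencode({'i': '你好'})
print(chinese)  # i=%E4%BD%A0%E5%A5%BD
```

Passing bytes like these as the second argument of urllib.request.urlopen is what turns the request into a POST.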
Example
import urllib.request
import urllib.parse
# Request URL captured from the Youdao Translate page
url = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
# Form data copied from the captured browser request
data = {}
data['i'] = 'i love python'  # the text to translate
data['from'] = 'AUTO'
data['to'] = 'AUTO'
data['smartresult'] = 'dict'
data['client'] = 'fanyideskweb'
data['salt'] = '16057996372935'
data['sign'] = '0965172abb459f8c7a791df4184bf51c'
data['lts'] = '1605799637293'
data['bv'] = 'f7d97c24a497388db1420108e6c3537b'
data['doctype'] = 'json'
data['version'] = '2.1'
data['keyfrom'] = 'fanyi.web'
data['action'] = 'FY_BY_REALTlME'
# URL-encode the form and convert it to bytes; passing data makes urlopen send a POST
data = urllib.parse.urlencode(data).encode('utf-8')
response = urllib.request.urlopen(url, data)
html = response.read().decode('utf-8')
print(html)
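Running this fails with error code 50, presumably because the translate_o endpoint checks the salt and sign values, and the ones here are stale copies from an earlier request. The fix is to remove _o from the URL so that the request goes to translate instead.
After removing it, the request succeeds.
However, the raw result is still too cluttered to read and needs further cleanup.
Import json, convert the response into a dictionary, and pick out the translated text.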
import urllib.request
import urllib.parse
import json
url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
data = {}
data['i'] = 'i love python'
data['from'] = 'AUTO'
data['to'] = 'AUTO'
data['smartresult'] = 'dict'
data['client'] = 'fanyideskweb'
data['salt'] = '16057996372935'
data['sign'] = '0965172abb459f8c7a791df4184bf51c'
data['lts'] = '1605799637293'
data['bv'] = 'f7d97c24a497388db1420108e6c3537b'
data['doctype'] = 'json'
data['version'] = '2.1'
data['keyfrom'] = 'fanyi.web'
data['action'] = 'FY_BY_REALTlME'
data = urllib.parse.urlencode(data).encode('utf-8')
response = urllib.request.urlopen(url,data)
html = response.read().decode('utf-8')
# Parse the JSON text into a dict and keep only the translated text
reply = json.loads(html)
result = reply['translateResult'][0][0]['tgt']
print(result)
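The indexing above assumes the request succeeded and that the reply holds exactly one sentence. A slightly more defensive version might look like the sketch below. It rests on two assumptions that the original does not confirm: that a failed request carries a non-zero errorCode field (as the error 50 seen earlier suggests), and that translateResult is a list of paragraphs, each holding a list of sentence entries. The function name extract_translation and the sample reply are made up for illustration.

```python
import json

def extract_translation(raw):
    """Return the translated text from a raw JSON reply string."""
    reply = json.loads(raw)
    # Assumption: a non-zero errorCode means the request was rejected.
    code = reply.get('errorCode', 0)
    if code != 0:
        raise ValueError(f'translation request failed, errorCode={code}')
    # Assumption: translateResult is a list of lists of dicts with a 'tgt' key.
    return ''.join(
        sentence['tgt']
        for paragraph in reply['translateResult']
        for sentence in paragraph
    )

# Hand-written sample shaped like the reply that the code above indexes into.
sample = '{"errorCode": 0, "translateResult": [[{"src": "i love python", "tgt": "我喜欢python"}]]}'
print(extract_translation(sample))  # 我喜欢python
```

Joining every tgt entry rather than reading only [0][0] keeps longer inputs from being cut off after the first sentence, if the reply is indeed split that way.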
However, this program can only translate the single hard-coded input and is useless after that, so I improved it again.
import urllib.request
import urllib.parse
import json
def translate():
    sentence = input('Enter the text to translate: ')
    url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
    head = {}  # add a request header to get past anti-crawler checks
    head['User-Agent'] = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'
    data = {}  # send the request with the form data
    data['i'] = sentence
    data['from'] = 'AUTO'
    data['to'] = 'AUTO'
    data['smartresult'] = 'dict'
    data['client'] = 'fanyideskweb'
    data['salt'] = '16057996372935'
    data['sign'] = '0965172abb459f8c7a791df4184bf51c'
    data['lts'] = '1605799637293'
    data['bv'] = 'f7d97c24a497388db1420108e6c3537b'
    data['doctype'] = 'json'
    data['version'] = '2.1'
    data['keyfrom'] = 'fanyi.web'
    data['action'] = 'FY_BY_REALTlME'
    data = urllib.parse.urlencode(data).encode('utf-8')
    req = urllib.request.Request(url, data, head)
    response = urllib.request.urlopen(req)
    html = response.read().decode('utf-8')
    reply = json.loads(html)  # parse the JSON text into a dict
    result = reply['translateResult'][0][0]['tgt']
    # print(f'Chinese-English translation result: {result}')
    return result
t = translate()
print(f'Chinese-English translation result: {t}')
The optimization is done, and it works reasonably well.![Python爬虫实现翻译功能_json_06](https://s4.51cto.com/images/blog/202108/04/551c7ae802f1d24a9fa5f2a4003066da.png?x-oss-process=image/watermark,size_16,text_QDUxQ1RP5Y2a5a6i,color_FFFFFF,t_100,g_se,x_10,y_10,shadow_90,type_ZmFuZ3poZW5naGVpdGk=)
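The function above still handles only one input per run. One possible next step, sketched here under assumptions, is a loop that keeps asking until the user enters a blank line. The sketch assumes a hypothetical translate_text(sentence) function, that is, a variant of translate() that takes the text as a parameter instead of calling input() itself. The read and write parameters are also illustrative; they exist so the loop can be tried without a keyboard or a network connection.

```python
def translate_loop(translate_text, read=input, write=print):
    """Translate line after line until a blank line is entered.

    translate_text is a hypothetical callable that takes the text to
    translate and returns the translation. Returns the number of
    successful translations.
    """
    count = 0
    while True:
        sentence = read('Enter text to translate (blank to quit): ').strip()
        if not sentence:
            break
        try:
            write(f'Result: {translate_text(sentence)}')
        except (OSError, ValueError, KeyError) as exc:
            # OSError covers urllib's network errors, ValueError covers
            # malformed JSON, KeyError covers a reply without the expected fields.
            write(f'Translation failed: {exc}')
        else:
            count += 1
    return count

# Offline demonstration with a stand-in translator and scripted input.
scripted = iter(['hello', ''])
translate_loop(lambda text: text.upper(), read=lambda prompt: next(scripted))
```

With a real translator plugged in, calling translate_loop(translate_text) would read from the keyboard and print each result, and one failed request would not end the session.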